ABSTRACT

In this paper, we study recommender systems from a network
perspective and investigate recommendation networks,
where nodes are items (e.g., movies) and edges are constructed
from top-N recommendations (e.g., related movies). In particular,
we focus on evaluating the reachability and navigability
of recommendation networks and investigate the following
questions: (i) How well do recommendation networks support
navigation and exploratory search? (ii) What is the influence
of parameters, in particular different recommendation
algorithms and the number of recommendations shown, on
reachability and navigability? and (iii) How can reachability
and navigability be improved in these networks? We
tackle these questions by first evaluating the reachability of
recommendation networks by investigating their structural
properties. Second, we evaluate navigability by simulating
three different models of information seeking scenarios. We
find that with standard algorithms, recommender systems
are not well suited to navigation and exploration and propose
methods to modify recommendations to improve this.
Our work extends from one-click-based evaluations of recommender
systems towards multi-click analysis (i.e., sequences
of dependent clicks) and presents a general, comprehensive
approach to evaluating navigability of arbitrary recommendation
networks.
1. INTRODUCTION

An important use case of a recommender system is its
ability to support browsing and navigation behavior. For
example, we know that users enjoy perusing item collections
without the immediate intention of making a purchase [10].
Flickr users have been found to predominantly discover new
images via social browsing [16]. For platforms where users
immediately consume content, such as YouTube, recommendations
serve the use case of unarticulated want, and are
therefore a crucial part of user experience. More generally,
some users prefer navigation to direct search even when
they know the target. In such exploratory scenarios, the
knowledge gained along the way provides context and aids in
learning and decision-making [24, 18].

Historically, little research has assessed the ability
of recommender systems to aid such navigation. Yet, hyperlinks
in a recommender system are, by their very conception,
meant to be navigated and used for search and exploration
tasks. While a few studies have already looked at recommendation
networks (see Figure 1 for an example) and provided
first important insights into the nature and structure of these
networks, there is no systematic approach to evaluating
navigability both statically (reachability) and dynamically
(navigability).

Research questions. We address the problem of evaluating
both reachability and navigability of recommendation
networks by investigating three research questions:

1. How well do recommendation networks support navigation
and exploratory search?

2. What is the influence of parameters, in particular different
recommendation algorithms and the number of
recommendations shown, on reachability and navigability?

3. How can reachability and navigability in recommender
systems be improved?

Approach. We study two types of recommendation networks:
(i) networks where links are generated from collaborative
filtering algorithms (using explicit user ratings), and
(ii) networks where links are generated from content (using
text similarity). We then use a two-level approach to
evaluating reachability and navigability:

1. Reachability: We evaluate reachability of recommendation
networks by looking at the network topology:
components, clustering, path lengths and bow-tie structure.
Analyzing the topological nature of recommendation
networks provides us with first insights into the
extent to which items are reachable.

2. Navigability: We then evaluate practical navigability
of recommendation networks using three different navigation
models established in the literature: (i) Point-to-Point
Search [13] as an example of goal-oriented
navigation with a single fixed goal, (ii) Berrypicking
as an example of goal-oriented navigation with multiple
and variable goals, and (iii) Information Foraging
as an example of exploration.

Contributions. Our contributions are three-fold: 

First, we present a general approach for evaluating navigability
of arbitrary recommendation networks via both
topological analysis and navigation models (simulation), and
demonstrate the feasibility of this approach by applying it to
actual recommendation networks.

Figure 1: Recommendation Network. Recommender systems
implicitly form recommendation networks, where nodes
are items (e.g., movies) and edges are directed hyperlinks
between related items. The illustration shows a scenario
where two recommendations are available for each movie.

Second, we find that the recommender systems we investigate
are poorly connected and not very well-suited for
navigation and exploration. On our datasets, we find that
recommendation networks generated by collaborative-filtering
algorithms perform better for most navigation scenarios. While
this suggests that collaborative filtering is a better choice for
the creation of navigable recommendations on our datasets,
we leave the task of applying our approach to other (real-world)
recommendation networks and datasets to future work.

Third, we propose a series of simple changes to recommendation
algorithms to introduce navigational diversity and
demonstrate how they help to overcome the reachability problem
in recommender networks.


2. RELATED WORK 

Network Analysis. Ever since Milgram's small world experiments
[19], researchers have been making efforts to understand
navigability and in particular efficient navigation in
networks. Kleinberg [11, 13] and Watts [30] formalized the
property that a navigable network requires short paths between
all (or almost all) nodes. Formally, such a network
has a low diameter bounded by a polynomial in log(n), where
n is the number of nodes in the network, and a giant component
containing almost all the nodes exists. In other
words, because the majority of network nodes are connected,
it is possible to reach all or almost all of the nodes, given
global knowledge of the network. This property is referred
to as reachability. The low diameter and the existence of a
giant component constitute necessary topological conditions
for network navigability. In this paper we apply a set of
standard network-theoretic measures including distribution
of component sizes and component structure (via the bow tie
model) to assess if a network satisfies them.

Kleinberg also found that an efficiently navigable network
possesses certain structural properties that make it possible
to design efficient local search algorithms (i.e., algorithms
that only have local knowledge of the network). The
delivery time (the expected number of steps to reach an
arbitrary target node) of such algorithms is then sub-linear
in n. In this paper, we investigate the efficient navigability
of networks through the simulation of a range of search and
navigation models.

Recommender Systems. Initially, recommender systems
were mostly evaluated in terms of recommendation accuracy.
More recently, the importance of evaluation metrics beyond
accuracy has been identified [10, 27]. Approaches such as
diversity, novelty (e.g., [21]) and serendipity, which
are thought to be orthogonal to the traditional accuracy-based
evaluation measures, have been found to increase user
satisfaction. Recommender systems have been found to
show a filter bubble effect (even though following recommendations
actually lessened the effect) [22]. Diversification of
recommendations can be an effective means of increasing the
spectrum of recommendations users are exposed to.

In terms of reachability, the static topology of recommendation
networks has been studied for the case of music recommenders.
Their corresponding recommendation networks
have been found to exhibit heavy-tail degree distributions and
small-world properties, implying that they are efficiently
navigable with local search algorithms. Seyerlehner et al.
studied sources (nodes that are never recommended) in music
recommendation networks [25] and found that the fraction
of sources remained constant independent of the recommendation
approach and the network size. This indicates that
recommendation networks generally suffer from reachability
problems. Celma and Herrera found that collaborative
filtering on last.fm led to recommendation networks that
are prone to a popularity bias, with recommendations biased
towards popular songs or artists. They also found that
collaborative filtering provided the most accurate recommendations,
while at the same time this made it harder for users
to navigate to items in the long tail. A hybrid approach and
content-based methods provided better novel recommendations.
These results suggest that a trade-off exists between
accuracy and other evaluation metrics. Mirza et al. [20]
proposed to measure reachability in the bipartite recommendation
graph of users and items as an evaluation measure.

A simple method to improve reachability is to select recommendations
specifically for their target in the network, which
has been proposed by Seyerlehner et al. [26]. In this work,
we improve on this method by proposing a method that not
only improves reachability but also ensures the relevance of
the selected recommendations.

While these analyses have shown certain topological properties
such as heavy-tail degree distributions and small-world
properties, we know very little about the dynamics of
actually using recommendations to find navigational paths
through a recommender system.

3. METHODS & EXPERIMENTAL SETUP 

In the following, we briefly sketch our approach for assessing
network navigability and describe the datasets and the
methods we applied to generate recommendation networks.
All datasets are publicly available.

3.1 Item Datasets 

We look at two exemplary types of items for our experiments:
movies from MovieLens and books from BookCrossing.

MovieLens is a film recommender system by the University
of Minnesota. For this work, we used the MovieLens
dataset consisting of one million ratings¹ from 6,000 users on
4,000 movies, where each user had rated at least 20 movies.
From this, we used the 3,640 movies that had corresponding
Wikipedia articles.

BookCrossing is a book exchange platform. For this work,
we used a 2005 crawl of the website. As a preprocessing
step, we filtered out implicit ratings and users who had
rated fewer than 20 books, leaving us with 110,610 books.
From these, we randomly sampled books until we were able to
match 3,640 books to their corresponding Wikipedia articles.
This left us with two datasets of the same size.

¹ http://grouplens.org/datasets/movielens

Mapping to Wikipedia articles. For the computation of
content-based recommendations, we mapped the movie and
book titles to corresponding articles in the English Wikipedia
and extracted the textual content. For the mapping we made
use of the naming conventions for movies and books used
on Wikipedia. For movies, the article title is generally the
movie title itself (e.g., Alice in Wonderland). However, in
case the title also refers to other media (such as the Lewis
Carroll book), the Wikipedia title will be Alice in Wonderland
(film). Should several movies with the same title exist, the
Wikipedia title will be Alice in Wonderland (1951 film). The
same conventions hold true for book titles.

Genres. We clustered items based on genre information,
which for MovieLens was supplied with the dataset. For
BookCrossing, we used information provided by Google Search.
As of early 2015, querying Google Search for, e.g., Alice in
Wonderland genres produces an infobox containing a range
of standardized genre affiliations, which we extracted and
added to our dataset. A manual inspection of 50 randomly
sampled books revealed that this was highly accurate in terms of
precision.

3.2 Building Recommendation Networks 

We calculated non-personalized collaborative-filtering and
content-based recommendations for both datasets in the following
way: For the given set of items $I$ and a similarity
measure, we compute the pairwise similarities for all pairs of
items $i$ and $j$. For each item $i \in I$, we define the set of the
top-N most similar items to $i$ as $L_{i,N}$. We then create a
top-N recommendation network $G(V, N, E)$, where $V = I$ and
$E = \{(u, v) \mid u \in I, v \in L_{u,N}\}$. We investigated values for $N$ in
$[1, 20]$, which we consider a plausible range for recommender
systems. This method leads to recommendation networks
with constant outdegree and varying indegree, representing a
typical setting for top-N recommendation networks.
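
As an illustration of this construction (a minimal sketch, not the implementation used for the experiments), the following Python code builds the adjacency lists of a top-N network from a precomputed similarity matrix. The function names `top_n_network` and `indegrees`, the tie-breaking by item index, and the toy matrix are assumptions made for this example.

```python
from typing import Dict, List, Sequence


def top_n_network(sim: Sequence[Sequence[float]], n: int) -> Dict[int, List[int]]:
    """Map every item u to its top-n most similar items (the set L_{u,N})."""
    items = range(len(sim))
    network = {}
    for u in items:
        # Rank all other items by decreasing similarity to u.
        # Ties are broken by item index (an assumption of this sketch).
        candidates = [v for v in items if v != u]
        candidates.sort(key=lambda v: (-sim[u][v], v))
        network[u] = candidates[:n]
    return network


def indegrees(network: Dict[int, List[int]]) -> Dict[int, int]:
    """Count how often each item is recommended by other items."""
    counts = {u: 0 for u in network}
    for targets in network.values():
        for v in targets:
            counts[v] += 1
    return counts


# Toy symmetric similarity matrix for four items (hypothetical values).
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
net = top_n_network(sim, n=2)
print(net)             # every item has exactly two outgoing links
print(indegrees(net))  # indegrees differ between items
```

Every node ends up with outdegree N, while the indegree depends on how often an item appears in other items' top-N lists.
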
Collaborative Filtering Recommendations (CF). We
used the user-rating matrices associated with the datasets
to calculate non-personalized collaborative filtering recommendations.
We considered the ratings for each item as a
vector, from which we then calculated cosine similarities to all
the other item vectors and presented the top-N most similar
vectors as recommendations.
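
The item-to-item similarity computation can be sketched as follows. This is a simplified illustration: it assumes that missing ratings are encoded as 0 and that the rating matrix is small enough to hold as nested lists. The names `cosine` and `item_similarities` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equally long vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def item_similarities(ratings: Sequence[Sequence[float]]) -> List[List[float]]:
    """ratings[u][i] is the rating of user u for item i (0 = not rated)."""
    n_items = len(ratings[0])
    # One rating vector per item: the corresponding column of the matrix.
    columns = [[row[i] for row in ratings] for i in range(n_items)]
    return [[cosine(columns[i], columns[j]) for j in range(n_items)]
            for i in range(n_items)]


# Three users, three items (hypothetical ratings).
ratings = [
    [5, 5, 0],
    [4, 4, 0],
    [0, 0, 3],
]
sims = item_similarities(ratings)
```

Items 0 and 1 are rated identically by the same users and are therefore maximally similar, whereas item 2 shares no raters with them and has similarity 0.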

Content-based Recommendations (CB). As a second 
recommendation approach, we calculated simple content- 
based recommendations. For this purpose, we computed the 
TF-IDF features on the Wikipedia articles corresponding to 
the items and then calculated the (non-personalized) cosine 
similarities between these feature vectors and used the top-N 
recommendations. 
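
A content-based similarity of this kind can be sketched in a few lines of Python. The text does not specify the TF-IDF variant, so the weighting used here (raw term frequency times log(N/df)) and the whitespace tokenization are assumptions of this sketch. The names `tfidf_vectors` and `sparse_cosine` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Sparse TF-IDF vectors using raw term counts and idf = log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def sparse_cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical stand-ins for article texts.
docs = [
    "spy agent mission london",
    "spy agent mission moscow",
    "rabbit tea party wonderland",
]
vecs = tfidf_vectors(docs)
```

The first two texts share most of their vocabulary and therefore receive a positive similarity, while the third shares no terms with them.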

3.3 Improving Reachability and Navigability 
through Diversification 

We experimented with three simple approaches to improve
reachability and navigability. We applied these approaches to
all top-N recommendation sets $L_{i,N}$ for N > 2 and used them
to find a replacement for the least similar recommendation
in $L_{i,N}$.

• Random. Random graphs generally exhibit a small diameter.
By replacing the last recommendation with a random
item, this algorithm served as an upper bound in terms
of achievable reachability improvement when replacing a
single recommendation for every item.

• Diversify. This approach aimed at diversifying the recommendation
list. The replacing recommendation was chosen
among the overall top-50 recommendations for the given
item as the one maximizing the pairwise dissimilarity to
the items already in the list, a procedure referred to as
diversify by Ziegler et al. The similarity measures in
this case were the cosine similarity of the rating vectors or
TF-IDF vectors.

• Expanded Relevance (ExpRel). This method chose the
additional recommendation among the overall top-50 recommendations
for the given item as the one maximizing
the one-step expanded relevance, as proposed by Küçüktunç
et al. The algorithm ranks nodes by taking into
account both the relevance of the potential node as well as
the fraction of its one-hop neighborhood that is not already
directly reachable via other recommendations. This approach
aims at adding diversity by providing a connection
into a useful area of the recommendation graph.
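
To make the replacement step concrete, the Diversify variant can be sketched as below. This is one possible reading of the description rather than the original implementation: it keeps the N-1 most similar items, then picks from the remaining top-50 candidates the one with the largest summed dissimilarity (1 minus similarity) to the kept items. Aggregating by sum, the function name `diversify` and the data layout are assumptions.

```python
from typing import Dict, List


def diversify(ranked: List[int], sim: Dict[int, Dict[int, float]],
              n: int, pool_size: int = 50) -> List[int]:
    """Replace the least similar of the top-n items by a diverse candidate.

    ranked: candidate items for one source item, most similar first.
    sim:    sim[a][b] is the similarity between items a and b.
    """
    kept = ranked[:n - 1]             # the n-1 most similar items stay
    pool = ranked[n - 1:pool_size]    # remaining top-50 candidates
    if not pool:
        return ranked[:n]

    def dissimilarity(candidate: int) -> float:
        # Summed pairwise dissimilarity to the items already in the list.
        return sum(1.0 - sim[candidate][k] for k in kept)

    return kept + [max(pool, key=dissimilarity)]


# Item 3 is very similar to the kept items 1 and 2, item 4 is not.
sim = {
    3: {1: 0.9, 2: 0.8},
    4: {1: 0.1, 2: 0.2},
}
print(diversify([1, 2, 3, 4], sim, n=3))
```

In the toy example, item 4 replaces item 3 in the top-3 list because it is far less similar to the two retained recommendations.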

4. REACHABILITY 

In the following, we present the topological characteristics
of the recommendation networks introduced in the previous
section. We will focus on analyzing (i) effective reachability in
terms of components and local clustering, (ii) efficient reachability
in terms of path lengths (eccentricity) and (iii) reachable
partitions of the graph in terms of a bow tie model analysis.
This provides us with insights into the topological nature
of these networks and with general clues for the high-level
navigation characteristics of such structures.

4.1 Effective Reachability 

Description. The analysis of the size of the largest connected
component and the distribution of component sizes
gives a direct answer to the reachability of a recommendation
network. Determining the size of the largest strongly
connected component is related to catalog coverage, but goes
beyond this in that it measures the size of the largest subset
of items that are not only recommended but also mutually
reachable. The local clustering coefficient provides insight
into the local topology and can give us hints about how
globally observed components emerge. It is computed as

$$C = \frac{1}{n} \sum_{i \in V} \frac{|\{(j,k) \in E \mid j,k \in \Gamma(i)\}|}{|\Gamma(i)|\,(|\Gamma(i)| - 1)}, \qquad (1)$$

where $\Gamma(i)$ is the set of nodes directly reachable from $i$
and $n$ is the number of nodes. For example,
strong local clustering with a high number of small components
and the absence of a giant component would indicate
that the network consisted of a high number of isolated caves
that are not connected to each other [29]. In terms of recommendation
systems these can be groups of items that
mutually recommend each other. This analysis can provide
us with explanations for observed phenomena.
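
The clustering coefficient described above can be computed directly from the adjacency lists of a recommendation network. The sketch below averages, over all nodes, the fraction of ordered pairs of a node's out-neighbors that are themselves linked. Counting nodes with fewer than two out-neighbors as 0 is an assumption of this sketch, as is the name `clustering_coefficient`.

```python
from typing import Dict, List


def clustering_coefficient(network: Dict[int, List[int]]) -> float:
    """Average local clustering coefficient of a directed network."""
    total = 0.0
    for i, targets in network.items():
        neighbors = set(targets)
        k = len(neighbors)
        if k < 2:
            continue  # contributes 0 (assumption of this sketch)
        # Count directed links (j, v) with both endpoints among i's neighbors.
        links = sum(1 for j in neighbors
                    for v in network.get(j, ())
                    if v in neighbors and v != j)
        total += links / (k * (k - 1))
    return total / len(network)


# Three items that all recommend each other: a fully clustered "cave".
cave = {0: [1, 2], 1: [2, 0], 2: [0, 1]}
# Recommendations that never close a triangle.
chain = {0: [1, 2], 1: [3], 2: [3], 3: [0]}
print(clustering_coefficient(cave), clustering_coefficient(chain))
```

A group of items that mutually recommend each other reaches the maximum value of 1, which matches the notion of isolated, highly clustered caves.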

Results and Interpretation. The size of the largest strongly
connected components in the networks grew with N (the number
of recommendations pointing away from items); see Figure 2.
For N > 7, the largest component contained at least 50%
of the nodes for all networks. Collaborative filtering led to
larger components than the content-based approach, with
MovieLens having the largest component thereof.

In real-world examples, the number of immediately visible
recommendations typically lies between four and twelve.
For instance, Amazon recommends between five and eight
items (depending on screen resolution), YouTube recommends
twelve videos and IMDb lists six related films. If our examples
generalize to these datasets, this comparison shows that
standard recommendation approaches allow users to explore
only around half the network by browsing. Even for 20 recommendations,
a great number of components still exists in
the network in the range of 5-20% of the network nodes (e.g.,
for our networks with 3,640 nodes, 10% components translate
to 364 components).

Figure 2: Reachability analysis in terms of components, clustering and eccentricity. The first two columns show
the size of the largest strongly connected component in the graphs and the number of components present. We find that
recommendation networks are not well-connected, but this improves with the application of diversification. The third column
shows the clustering coefficient. Diversification led to larger components and lower clustering and thus has the
downside of potentially making navigability harder. The fourth column depicts the distribution of eccentricities (the maximum
shortest distances) between all pairs of nodes in the largest strongly connected component for N = 5 recommendations. While
eccentricity values are shorter for collaborative filtering (CF) than for content-based (CB) recommendation networks, distances
are too long for navigation in real-world systems in both types of networks. Diversification approaches help in reducing path
lengths.

The application of the three diversification algorithms to
improve reachability overall had a positive effect. Replacing
one recommendation with a random one led to a large increase
in the size of the largest strongly connected component and
to a largest component of 80% for 3 recommendations. As
random connections are known to strongly increase reachability,
we took this approach as our baseline. Both the Diversify
and the ExpRel approaches led to an increase in the size
of the largest component, with Diversify performing better.
For top-5 recommenders, introducing a single diverse (but still
relevant) recommendation leads to a largest component comprising
60-80% of all nodes, thus strongly improving reachability.

The clustering coefficient correlated negatively with the size
of the largest component: the smaller the largest component,
the higher the clustering coefficient, indicating the existence of
isolated but highly clustered "caves" of items. When all links
of a set of nodes are confined within a smaller component,
that component is necessarily clustered more strongly. This
might make navigation within the smaller component easier
at the cost of not being able to reach larger parts of the
network outside the component. The application of the
three diversification algorithms led to lower clustering in the
network. This indicates that while these algorithms connect
more nodes to the core of the network, they might render
navigability more difficult by removing some of the more
obvious connections.

Findings. We combine a set of global (component sizes)
and local (clustering) measures to assess reachability of
recommendation networks. For our datasets, recommendation
networks are not well-connected, with 20-40% of
nodes residing outside of the giant component in hundreds of
small disconnected components. However, reachability can
be improved by replacing the least relevant recommendation
for each node with a diversified one. Clustering measures
indicate that there exists a trade-off between stronger clustering
(potentially facilitating navigation) and reachable parts
of the network (where disconnectedness thwarts navigation).

4.2 Efficient Reachability 

Description. As the second step in assessing reachability, 
we investigated how efficiently recommendation networks are 
reachable. In order to obtain insight into the lengths of the 
shortest paths, we examined the eccentricity distributions 
of the largest components. The eccentricity of a node i 
is the longest shortest path from i to any other node in 
the component. This provided us with a means of gauging 
navigability from the path lengths. 
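
Eccentricities in an unweighted network can be obtained with one breadth-first search per node. The following sketch assumes that the adjacency lists have already been restricted to the largest strongly connected component, so that every node is reachable from every other node. The names `eccentricity` and `diameter` are chosen for this example.

```python
from collections import deque
from typing import Dict, List


def eccentricity(network: Dict[int, List[int]], source: int) -> int:
    """Longest shortest path (in clicks) from source to any reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in network.get(u, ()):
            if v not in dist:          # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(network: Dict[int, List[int]]) -> int:
    """Maximum eccentricity over all nodes."""
    return max(eccentricity(network, u) for u in network)


# A directed cycle of four items with one recommendation each.
cycle = {0: [1], 1: [2], 2: [3], 3: [0]}
# The same cycle with one additional recommendation from item 0 to item 2.
shortcut = {0: [1, 2], 1: [2], 2: [3], 3: [0]}
print(diameter(cycle), eccentricity(shortcut, 0))
```

In the toy example, a single additional link reduces the eccentricity of item 0 from 3 to 2, the same mechanism by which extra or diversified recommendations shorten paths.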

Results and Interpretation. Figure 2 plots the distribution
of the eccentricity values of all nodes in the largest
components for N = 5 recommendations (results for larger
values of N were qualitatively similar). Overall, we find that
collaborative filtering led to shorter paths than content-based
recommendations. This was in spite of collaborative filtering
having a larger strongly connected component, a phenomenon
that has been observed in many types of graphs [17].

Figure 3: Bow tie analysis of the networks. This figure plots the distributions of the component memberships according
to the bow tie model in the recommendation networks over N (the number of recommendations) for MovieLens (a) and
BookCrossing (b). For small N, node memberships vary. With increasing N, most of the nodes are in the largest strongly
connected component (SCC).

The diameters of the networks (i.e., the maximum eccentricity)
for N = 5 ranged from 14 to 38. Large diameters raise
the question of whether users would actually undergo click
sequences of this length to navigate the items. PageRank
calculations assume a teleportation factor of 15%, meaning
that in 15% of clicks, a user does not follow a link on a web
page but teleports by manually typing in a new address or
using a search engine. If we follow this model, then after
five clicks the number of users following links within a recommender
system has reduced to 45% and further decreases
to 20% after 10 clicks and 4% after 20 clicks. While these
are average estimates for the general Web, they clearly point
out that standard recommender systems are not sufficiently
navigable for N = 5 recommendations. Furthermore, these
eccentricity values represent only lower bounds, as they would
require users to always proceed on the shortest possible paths.
Analysis of Wiki game data, where players actively try to
find shortest paths, has shown that humans need at least two
clicks more on average [31].
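
The retention estimates derived from the teleportation model follow from a simple geometric decay: if a user abandons link-following with probability 0.15 at every click, the share still following links after k clicks is 0.85^k. The short sketch below reproduces this arithmetic. The function name `share_still_browsing` is hypothetical.

```python
def share_still_browsing(clicks: int, teleport: float = 0.15) -> float:
    """Fraction of users who have followed links for `clicks` clicks in a row."""
    return (1.0 - teleport) ** clicks


for clicks in (5, 10, 20):
    print(clicks, round(100 * share_still_browsing(clicks), 1))
```

This yields about 44.4%, 19.7% and 3.9% for 5, 10 and 20 clicks, close to the rounded figures quoted in the text.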

A simple method to shrink the diameters is to increase 
the number of recommendations shown. And indeed we find 
that for increasing values of N, the eccentricity distributions 
shift towards smaller diameters. For N = 20, most nodes in 
the collaborative filtering networks are reachable within 10 
steps. However, 20 recommendations are considerably more 
than the five to twelve recommendations that most real-world 
systems use and potentially clutter user interfaces and make 
finding the right recommendation more difficult. 

We found that diversification approaches were a more suitable
means for reducing path lengths. Substituting a single
recommendation shifted the eccentricity distributions closer
towards shorter paths, while keeping recommendations relevant
and the number of recommendations constant. We found
that the Diversify approach led to the largest improvement
in terms of efficient reachability.

Findings. We examine the distributions of the longest shortest
path between nodes in the largest components. For our
datasets, we find that distances between nodes (up to 39
hops) are too long for reasonably efficient navigation. Collaborative
filtering led to shorter paths than content-based
recommendations. Diversification approaches prove to be a
promising means of reducing path lengths.

4.3 Partition Reachability 

Description. After the analysis of effective and efficient
reachability, we can conclude that recommendation networks
lack reachability in many cases. As the next step, we now
investigate the reachability of different sections in these networks.
This analysis can give us more clarity as to what areas
of the network are connected, and reveal one-way connections
between parts of a recommender system.

A prominent model for graph partitioning is the bow tie
model, developed for the analysis of the Web. This model
partitions a directed network into three major components:
the largest strongly connected component (SCC), wherein
all nodes are mutually reachable, a component of all nodes
from which SCC can be reached (IN) and a component of
all nodes reachable from SCC (OUT). Figure 4 shows the
model in more detail and explains the components.
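
The three major bow tie components can be derived from forward and backward reachability alone. The sketch below is a simplified illustration: it finds the largest strongly connected component by intersecting forward and backward reachable sets (simple, but not the most efficient method), and it lumps tendrils, tubes and disconnected nodes together as OTHER. The names `reachable` and `bow_tie` are hypothetical, and the network is assumed to be non-empty.

```python
from collections import deque
from typing import Dict, List, Set


def reachable(adj: Dict[int, List[int]], start: int) -> Set[int]:
    """All nodes reachable from start by breadth-first search (incl. start)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def bow_tie(network: Dict[int, List[int]]) -> Dict[str, Set[int]]:
    """Partition nodes into SCC, IN, OUT and OTHER."""
    nodes = set(network) | {v for targets in network.values() for v in targets}
    reverse: Dict[int, List[int]] = {u: [] for u in nodes}
    for u, targets in network.items():
        for v in targets:
            reverse[v].append(u)

    # Largest SCC: intersect forward and backward reachability per node.
    scc: Set[int] = set()
    unassigned = set(nodes)
    while unassigned:
        v = next(iter(unassigned))
        component = reachable(network, v) & reachable(reverse, v)
        unassigned -= component
        if len(component) > len(scc):
            scc = component

    seed = next(iter(scc))
    out_part = reachable(network, seed) - scc   # reachable from the core
    in_part = reachable(reverse, seed) - scc    # can reach the core
    other = nodes - scc - in_part - out_part
    return {"SCC": scc, "IN": in_part, "OUT": out_part, "OTHER": other}


# Toy network: core {0, 1, 2}, item 4 leads into it, item 3 is a dead end.
toy = {0: [1], 1: [2], 2: [0, 3], 3: [], 4: [0], 5: []}
parts = bow_tie(toy)
print(parts)
```

Applied to the networks for each value of N, such a partition gives the membership fractions that a bow tie analysis reports.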


Results and Interpretation. Figure 3 shows the partitioning
of the recommendation networks by the bow-tie model.
While for a small number of recommendations (i.e., low N)
component division is relatively diverse, after about five recommendations
all networks are partitioned mainly into IN
and SCC. This implies that the network mainly consists of a
strongly connected core and a partition leading to it.

A detailed analysis of where links from the IN component
pointed underlined this intuition: In all networks, more
than two thirds of all links from items in IN pointed to the
SCC. From a navigational perspective, this means that items
in the SCC can be reached from any item, but items in IN
are in most cases only reachable by direct selection, e.g., via
search results. Note that with increasing N, nodes are bound
to end up in the SCC as density increases.

For collaborative filtering networks, the SCC components
were larger than for the content-based networks. This is
likely due to the different features used in the recommendation
generation. Collaborative filtering relied on ratings and
connected items that were rated similarly, whereas content-based
recommendations connected items based on their textual
similarity. As such, content-based similarity tended to
connect items that are described with the same key words.
For instance, James Bond movies mostly linked to movies of
the same franchise for content-based recommendations. By
contrast, for collaborative filtering, recommendations from
James Bond movies were more diverse and featured some
recommendations to other action movies. Thus, the content-based
approach was more strongly clustered and led to a
smaller core that was harder to reach, a fact also visible from
the higher clustering coefficients for content-based networks
(cf. Figure 2).

The diversification approaches showed two effects: (i) they
increased the size of SCC and decreased the size of IN and
(ii) they strongly reduced the sizes of the rest of the components.
In terms of reachability, this is a desirable effect, as
it makes the core (SCC) reachable from a larger fraction of
the items in the recommendation network.

The collaborative filtering networks for BookCrossing included
a relevant number of nodes in the out-component
(OUT) and the out-tendril (TL_OUT) of the network for
N > 5. Figure 5 shows a membership change analysis for two
selected networks. While initially (for N = 1), a significant
portion of the network is present in the form of the out-tendril
and the out-component, nodes from these components then
pass into the core (SCC) with increasing N. For a smaller
number of recommendations N two separate strongly connected
components with different sizes exist: SCC and OUT.
With an increasing number of recommendations, SCC attracted
more items than OUT. In addition, recommendations
from some of the items in the SCC pointed to OUT items
(but not vice versa), which connected the two components.
An explanation for this situation could be the average number
of ratings for items, which was significantly higher for
SCC items and lower for items in the OUT component. As
recommendations were calculated based on cosine similarity,
items with few co-ratings were more likely to reciprocate
their recommendations for other items with only few ratings,
and popular items with many co-ratings were more likely to
recommend other popular items. This made items in OUT
more likely to remain in the component in case of collaborative
filtering (the problem with co-ratings was not present
for content-based recommendations).

Figure 4: Bow Tie Model. The bow tie model partitions
a network into a strongly connected component or
core (SCC), flanked by IN, where nodes can reach the core
but are not reachable from it and OUT, where nodes are
reachable from the core but not vice versa. Further components
are TUBE, providing an alternative route from IN to
OUT and the TENDRILS (TL_IN, TL_OUT) which contain
nodes connected to IN and OUT which cannot reach the
core. Remaining nodes are collected in OTHER.

Findings. We analyze recommendation networks based on
the bow tie model, which partitions networks into components
based on reachability. We find that the networks consist
mainly of a strongly connected core of popular items together
with an IN component leading to the core. This implies
that the core is reachable from most items. With diversified
recommendations, networks have more components in the
core and the IN components, thus making the network better
connected. Constructing navigable recommender systems
could potentially be facilitated with the help of a modified
similarity measure which is less harsh on the number of total
ratings per item. The bow-tie model could then be used to
evaluate and select appropriate similarity measures.

5. NAVIGABILITY 

In the first part of this article, we investigated the reachability
of recommender systems. As the second step of our
analysis, we now focus our attention on the dynamics of
searching and navigating recommendation networks. In a
typical information seeking model, users move from one item
to another by traversing recommendations. This activity can
be intertwined with using the search function, e.g., exploring
the results, backtracking and taking another path through
the recommendations, or simply entering a refined search
query in the search field [32]. Browsing is an important use
case of a recommender system, as many users find browsing
pleasant [10] or use it to discover new content [16]. A
defining property of this process is that the knowledge users
have about the network is generally local: users only know
about the links emanating from the current node and have
intuitions about where those links lead.

Figure 5: Membership change analysis of the bow tie
structure of the collaborative filtering BookCrossing
network. The figure plots the node membership changes
between selected values of N (the number of recommendations;
panels: N = 1, 5, 10, 15, 20). While initially most nodes belong
to the OTHER component, nodes move to IN and SCC with increasing N.
As a particularity for this algorithm and dataset, the OUT
component persists with increasing N. This is due to node
intake from TL_OUT.

Several information seeking models have been established 
in the literature to model navigation and exploration in 
information networks. In this paper we concentrate on three 
of these models: 


• Point-To-Point Search [13],

• Berrypicking,

• Information Foraging [23].

We examine the navigability of recommendation networks by
simulating scenarios based on these three information seeking
models and by measuring the success of the simulations
in achieving a given navigation goal. This analysis shows
us to what extent recommendation networks are suitable
for dynamic processes such as navigation and exploration.
This evaluation goes beyond a standard one-click evaluation
scenario in recommender systems: in particular, it inspects
the suitability of these networks to accommodate
users in following several sequential recommendations, one
after the other.

For all simulations, we applied a greedy search mechanism. 
We assumed that a function existed for each node which 
we could evaluate for each outgoing link in reference to the 
current navigation goal. The simulation then always acted 
greedily and selected the link maximizing this function (or 
backtracked in case of a dead end). We ran simulations for a 
total of 50 steps per goal. In particular, the next node was
selected greedily based on background knowledge, which
we represented as a matrix S containing similarities between
pairs of nodes (i.e., items). For a pair of nodes (i, j), where i
is a candidate node and j is the target node, S_ij represented
the function value for the candidate node i. We used the
following types of background knowledge:


(i) Title: The similarity matrix S contained the cosine
similarity of the TF-IDF features for item titles. This
represented the intuitions about navigation gained from
looking at the titles of recommendation targets.

(ii) Neighbors: S held the cosine similarity of the vector
of neighbors of nodes in the recommendation network.
This represented intuitions about areas of the network,
e.g., a science fiction film likely leads to more films of
the same genre.

(iii) Wikipedia neighbors: S contained the cosine similarity
of the vector of neighbors of nodes in the Wikipedia network
(with items mapped to Wikipedia articles). This
represented similar intuitions as those for the Neighbors
background knowledge, but with the information
originating from an external body of knowledge.

(iv) Shortest Paths (Optimal Solution): For the optimal
solution, we used the matrix containing information on
the shortest paths.

(v) Random: This matrix consisted of all zeros (containing
no background knowledge at all) which resulted in a
random walk.

Figure 6: Information Seeking Scenarios. This figure shows the three information seeking scenarios used in our analysis.
The goal in Point-To-Point Search was to find a single target node. For Berrypicking, we clustered the networks and set the
goal of finding any one node in four predetermined clusters (shown in gray). For Information Foraging, the goal was to find all
nodes in one predetermined cluster.

Figure 7: Success ratios for the Point-To-Point Search
scenario for N = 5 and N = 20 recommendations.
Success ratios were better for a larger number of recommendations
and were also improved by diversification. Collaborative
Filtering (CF) led to better results than Content-Based
methods (CB).

We believe that these different types of knowledge represent 
a good inventory of possible intuitions for navigating recommendation
networks. We found that the best-performing
background knowledge leads to success ratios well within the 
baselines (the random walk and the shortest paths) in all cases. 
For the sake of brevity, we only report the best-performing 
approaches for each network in the following section. 
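
To make the notion of background knowledge more concrete, the following sketch builds a similarity matrix of the Neighbors type. It rests on assumptions of ours that the text does not specify: the "vector of neighbors" is taken to be the binary indicator vector of a node's outgoing recommendations, the matrix is stored as a dictionary keyed by node pairs, and the function name `neighbor_similarity` is hypothetical.

```python
import math


def neighbor_similarity(edges):
    """Cosine similarity between the out-neighbor sets of all node pairs.

    `edges` is a list of directed (source, target) recommendation links.
    Returns a dict S with S[i, j] as the similarity of nodes i and j.
    """
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = {n: set() for n in nodes}
    for src, dst in edges:
        neighbors[src].add(dst)

    S = {}
    for i in nodes:
        for j in nodes:
            # For binary vectors the dot product is the size of the
            # intersection and each norm is the square root of the set size.
            norm = math.sqrt(len(neighbors[i]) * len(neighbors[j]))
            S[i, j] = len(neighbors[i] & neighbors[j]) / norm if norm else 0.0
    return S
```

A Title matrix could be obtained analogously by replacing the neighbor sets with TF-IDF vectors of the item titles.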

5.1 Point-To-Point Search 


Description. Point-To-Point Search represents the task of
finding a single target item in the recommendation network.
We randomly sampled 1,200 pairs of nodes from each network,
not taking reachability into account. We then ran navigation
simulations for all of these pairs, starting at the start node
of the pair and with the objective of reaching the target
node. As an example, in simulations with Title background
knowledge, the next node to go to was always (greedily)
chosen to be the neighboring node with the most similar title
to the target. Note that the outcome was therefore affected
by both the reachability and navigability of the network.
Figure 6 displays an example of a Point-To-Point scenario.

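The greedy simulation for a single start-target pair can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure: we assume that already visited nodes are not revisited, that a backtracking move counts as one step, and that ties are broken by taking the first candidate (so an all-zero matrix does not yield a truly random walk here). The name `greedy_navigate` and the dictionary-based representation of S are hypothetical.

```python
def greedy_navigate(adj, S, start, target, max_steps=50):
    """Greedily walk from `start` towards `target`; return True on success.

    `adj` maps each node to its list of recommended nodes and
    S[(i, j)] scores candidate node i with respect to target j.
    """
    path = [start]        # current path, kept for backtracking
    visited = {start}
    for _ in range(max_steps):
        current = path[-1]
        if current == target:
            return True
        candidates = [n for n in adj.get(current, ()) if n not in visited]
        if candidates:
            # Follow the link whose endpoint scores highest for the target.
            nxt = max(candidates, key=lambda n: S.get((n, target), 0.0))
            visited.add(nxt)
            path.append(nxt)
        elif len(path) > 1:
            path.pop()    # dead end: backtrack one step
        else:
            return False  # nothing left to explore
    return path[-1] == target
```

The success ratio for a network is then the fraction of sampled pairs for which this function returns True.
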
Results and Interpretation. Figure 7 displays the success
ratio (i.e., the fraction of successful simulations), showing only
the result with the best-performing background knowledge. In
the case of MovieLens, this was the Title background knowledge
and for BookCrossing the Neighbors knowledge. This is
likely an artifact of the higher clustering in the BookCrossing
networks, suggesting that these networks were better suited
to guiding search with intuitions about the common Neighbors
towards the target. In the case of MovieLens, by contrast,
the Title similarities proved to be better in achieving this,
indicating that film titles in our datasets were more indicative
of the general area in the network than book titles. The same
ranking of background knowledge also held for the
Berrypicking and the Information Foraging scenarios.

Overall, performance with Point-To-Point Search was not
very satisfactory for most of the recommendation networks
investigated. For N = 5 recommendations, only 2-6% of
targets were found for standard recommendation algorithms,
which was increased by up to 20% with diversification approaches.
Simulations on collaborative filtering networks
generally outperformed those on the content-based networks.
With an increased number of recommendations, we observed a
significant performance gain. For N = 20 recommendations,
8-40% of targets were found for standard algorithms, and
up to 70% for diversified approaches. As in the analysis of
reachability, we took the random diversification as an upper
bound of the possible increase when substituting a single
recommendation. In contrast to reachability, for navigability the
Diversify approach actually performed slightly worse than
standard algorithms, while ExpRel led to an increase. This
suggests that a trade-off exists between improving reachability
and navigability.

Findings. In evaluating the Point-To-Point Search scenario, 
we find that the networks are not well suited to this navigation 
model using standard algorithms. This situation can be somewhat
improved by increasing the number of recommendations
and by applying diversification algorithms. 

Figure 8: Success ratios for the Berrypicking scenario
for N = 5 and N = 20 recommendations. Success ratios
were better for a larger number of recommendations and were 
also improved by diversification. Collaborative Filtering (CF) 
led to better results than Content-Based methods (CB). 


5.2 Berrypicking 

Description. Berrypicking is an information seeking model
proposed by Marcia J. Bates, which regards information
seeking as a dynamic and evolving process. By contrast,
Point-To-Point Search might be regarded as a static model
of information seeking where the information need remains
constant throughout a complete navigation session. In Berrypicking,
the information need is evolving and can be satisfied
by multiple pieces of information in a bit-at-a-time retrieval,
an analogy to picking berries on bushes, where berries
are scattered and must be picked one by one.

Based on Berrypicking, we evaluated the following navigation
scenario in our recommendation networks: We first
created clusters of network nodes based on publication year
and genres. We then chose all clusters with 3 to 30 nodes
as input and randomly sampled a total of 1,200 subsets containing
four clusters each. With these as input, we randomly
chose one node from the first cluster as the starting point in
the network. The objective of the scenario was then to reach
an arbitrary node from the second cluster, followed by an
arbitrary node from the third and finally an arbitrary node
from the fourth cluster. In this way, the scenario modeled
the evolving stages of Berrypicking with information needs
changing after every discovered item.

The basic implementation of this scenario was the same 
as for Point-To-Point Search, using the same three types of 
background knowledge. A difference here, however, was that
the target of the navigation was now not a single node but 
the cluster centroid (using the average similarity of all nodes 
in the cluster). The next node to go to was then chosen as 
the node with the highest similarity to the centroid in every 
step. 
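
The Berrypicking scenario described above can be sketched as a sequence of greedy walks, each directed at the centroid of the next target cluster. The sketch makes several assumptions of ours: the step budget applies per target cluster, visited nodes are not revisited within one walk, ties are broken by taking the first candidate, and all function names are hypothetical.

```python
def cluster_score(S, node, cluster):
    """Average similarity of `node` to all nodes in `cluster` (centroid score)."""
    return sum(S.get((node, member), 0.0) for member in cluster) / len(cluster)


def walk_to_cluster(adj, S, start, cluster, max_steps):
    """Greedy walk with backtracking; return the first cluster node reached, or None."""
    path, visited = [start], {start}
    for _ in range(max_steps):
        current = path[-1]
        if current in cluster:
            return current
        candidates = [n for n in adj.get(current, ()) if n not in visited]
        if candidates:
            nxt = max(candidates, key=lambda n: cluster_score(S, n, cluster))
            visited.add(nxt)
            path.append(nxt)
        elif len(path) > 1:
            path.pop()      # dead end: backtrack one step
        else:
            return None     # nothing left to explore
    return path[-1] if path[-1] in cluster else None


def berrypicking(adj, S, start, target_clusters, max_steps=50):
    """Visit the target clusters in order; return the fraction of clusters reached."""
    current, found = start, 0
    for cluster in target_clusters:
        reached = walk_to_cluster(adj, S, current, cluster, max_steps)
        if reached is None:
            break
        # The information need changes: continue from the item just found.
        current, found = reached, found + 1
    return found / len(target_clusters)
```

With three target clusters per run, a returned value of 2/3 means that the second and third clusters were reached but the fourth was not.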

Results and Interpretation. Similar observations as those
for the Point-To-Point Search can be made for the Berrypicking
scenario: for a small number of recommendations, none
of the networks performed well. In this case, the success
ratio was the average percentage of targets found: 10% success
for Berrypicking means that, on average, 10% of three
targets (instead of one in the case of Point-To-Point Search)
had been found. This implies that even though the success
ratio was almost the same as for Point-To-Point Search, more
targets were found overall. This suggests that the recommendation
networks we studied were better suited to supporting
Berrypicking than Point-To-Point Search.

Even with diversification approaches applied and N = 20
recommendations, however, only 20-40% of scenarios were 
successfully completed. This indicates that while one or two 
clusters were found in the simulations, finding all clusters 
proved to be too difficult. For recommender systems, the 
combination of recommendations with an efficient search 
function is therefore vital to support information seeking and 
browsing. 

Findings. We find that Berrypicking, a scenario representing
dynamic information search, was somewhat better
supported than Point-To-Point Search. A high number of
recommendations and diversification led to success ratios 
around 40%. 


5.3 Information Foraging 

Description. Information Foraging [23] is an information 
seeking theory inspired by Optimal Foraging Theory in nature, 
where organisms have adopted strategies maximizing energy 
intake per time unit. For instance, when foraging on a patch 
of food (e.g., apples on a tree), animals must decide when 
to move on to the next patch (e.g., if finding new apples 
on the current tree has become too strenuous or all apples 
have been consumed). In the 1990s, Peter Pirolli and others 
found that some of the same mechanisms hold good for 
human information seeking behavior, and that humans try 
to maximize the information gain per time unit. Information 
is perceived as occurring in patches, indicated by information
scent.

In a scenario based on Information Foraging, we model
the depletion of a patch of information in a recommendation
network. We assume that after using the search
function and arriving at one of the nodes in an information
patch (i.e., a cluster as defined in Berrypicking), the objective
is now to find all the other nodes in the patch, guided
by information scent in terms of the background knowledge,
which represents intuitions about items.

The implementation of this scenario was very similar to 
the one used for Berrypicking: Navigation was directed not 
at single nodes but towards a cluster centroid. 
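
A possible sketch of this scenario is given below. It is an illustration under our own assumptions: the walk continues after a patch node is found until the step budget is spent or the patch is depleted, visited nodes are not revisited, ties are broken by taking the first candidate, and the function name `forage` is hypothetical.

```python
def forage(adj, S, start, patch, max_steps=50):
    """Greedy walk from `start`; return the fraction of the other patch nodes visited."""
    targets = set(patch) - {start}
    if not targets:
        return 1.0

    def score(node):
        # Centroid score: average similarity to all nodes in the patch.
        return sum(S.get((node, member), 0.0) for member in patch) / len(patch)

    path, visited, found = [start], {start}, set()
    for _ in range(max_steps):
        if found == targets:
            break               # patch depleted
        candidates = [n for n in adj.get(path[-1], ()) if n not in visited]
        if candidates:
            nxt = max(candidates, key=score)
            visited.add(nxt)
            path.append(nxt)
            if nxt in targets:
                found.add(nxt)
        elif len(path) > 1:
            path.pop()          # dead end: backtrack one step
        else:
            break               # nothing left to explore
    return len(found) / len(targets)
```

The returned fraction corresponds to a success ratio based on the total number of target nodes in the patch.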

Results and Interpretation. The simulations for the Information
Foraging scenario performed best in the MovieLens
collaborative filtering network. This suggests that the clustering
(which was performed based on year of publication
and genres) was best represented in this network. Furthermore,
titles were generally more indicative of targets in this
network, with Title being the best-performing background
knowledge. While one could expect that this method of clustering
would favor the content-based networks (as title and
categories are part of the textual content), this was evidently
not the case. Another possible reason for this behavior could
be the number of components and the size of the largest
connected component: the MovieLens collaborative filtering
network had the largest giant component and the smallest
number of components. Evidently, if the task is to visit as
many nodes as possible from a specific part of the network,
the reachability of that part of the network plays the most
important role. In general, recommendation networks seem
to suffer from the existence of a large number of (almost) isolated
“caves” that are only loosely connected to each other. In
the MovieLens collaborative filtering network, this problem
was to some extent solved by a larger number of recommendation
links that better connected the “caves” and formed a
larger connected component. This analysis shows that, apart
from shortening the shortest paths in the recommendation
networks, injecting more links may improve navigability by
ensuring a more robust connectedness of the item “caves” and
the networks in general.

Figure 9: Success ratios for the Information Foraging
Search scenario for N = 5 and N = 20 recommendations.
Success ratios were better for a larger number
of recommendations and were also improved by diversification.
Collaborative Filtering (CF) led to better results than
Content-Based methods (CB).

The success ratio for this scenario was again based on the
total number of target nodes. With the best approaches finding
30-40% of nodes in 10 steps, this might already be satisfactory
to some extent for recommendation networks in exploratory
scenarios. Diversification approaches showed only a small
improvement in terms of success ratio. One possible explanation
could be that the clustering coefficients, which were
negatively affected by diversification, had a substantial influence
on the results. A diversification algorithm specifically
connecting caves or clusters in the network could improve
the results here, but is beyond the scope of this analysis.

Findings. We find that for an explorative scenario based
on Information Foraging, similar outcomes as for the other
scenarios hold true: it was not supported very well in our
datasets. This analysis reveals another important structural
deficiency of recommender networks: poor connectivity between
item “caves”. In order to improve navigability of recommendation
networks, this deficiency could be overcome by
specifically connecting components.
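
The two structural quantities referred to in this analysis, the number of components and the size of the largest component, can be measured with a short sketch such as the following. We assume here that components are weakly connected (edge direction is ignored), which the text does not state explicitly; the function name `weak_components` is hypothetical.

```python
from collections import deque


def weak_components(edges, nodes=()):
    """Return the weakly connected components of a directed network, largest first.

    `nodes` may list additional isolated nodes that have no edges.
    """
    adj = {n: set() for n in nodes}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set()).add(src)   # ignore edge direction

    components, seen = [], set()
    for node in adj:
        if node in seen:
            continue
        # Breadth-first search collects everything connected to `node`.
        component, queue = {node}, deque([node])
        while queue:
            for nxt in adj[queue.popleft()]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

The length of the returned list gives the number of components, and the size of its first element gives the size of the giant component, so the effect of injecting additional links can be compared before and after.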

6. DISCUSSION 

In this work, we evaluated recommendation networks in 
the context of reachability and navigation dynamics. We 
presented a general approach and applied it to two networks. 
We found that recommendation networks created with standard
recommendation algorithms did not fare well in terms
of either reachability or navigability, and showed that this could
be improved with diversification approaches.

The navigation models applied in this approach are largely 
well-established in the research community and cover a wide 
range of typical user interaction scenarios with information 
systems in general, and recommender systems in particular. 
Greedy search, the basis for our navigation scenarios based
on these models, has been used in previous work to analyze
navigation dynamics in networks [15]. The navigation
models were deliberately kept simple, as the focus of our
work was not on the information seeking models and their
validity but on the properties of recommendation networks.
Our evaluation approach does not depend on a particular
model, which can be adapted or exchanged in future work.
Possible enhancements include teleportation to model the
interplay with the search function, stochastic instead of deterministic
action selection, or a learning component, e.g., for
memorizing preferred paths.

The collaborative filtering approach used in the generation
of the networks is very basic. Expanding this work to personalized
recommendations would be a logical extension
of our work. It must be noted, however, that this would
also lead to distinct networks for each and every user and
would require larger-scale analysis. In using non-personalized
collaborative filtering, we effectively inspect recommendation
networks for users who are new to the system or simply
browsing the page without being registered: as such users have
assigned no ratings, the system can only suggest the globally most
similar items.

Additional future work could include content-based recommendations
based on more elaborate features. As of now,
collaborative filtering features appear to lead to a better 
comprehension of items, as suggested by the topology and 
navigation results. In future work, other, more sophisticated 
text feature algorithms such as LSA or LDA could be used to 
possibly improve on this. However, our evaluation approach 
is general enough to accommodate all these future extensions. 

We investigated three distinct diversification algorithms 
as possible improvements to reachability and navigability. 
In the analysis of reachability, we found that the Diversify 
method performed best and was close to the addition of a 
random recommendation, while at the same time ensuring 
the relevance of the included recommendation. By contrast, 
for navigability Diversify mostly led to a slight decrease in 
success, whereas the ExpRel approach was able to improve 
results. These results point to differences between reachability
and navigability of recommender systems. A more detailed 
investigation will be necessary in the future to combine the 
best elements of both approaches and develop a diversification 
measure supporting both reachability and navigability. 

Another approach to improving navigability would be to 
increase the number of recommendations shown. However, 
this risks cluttering the user interface with too many recommendations
and is therefore mostly avoided in real-world
recommender systems. An alternative could be to keep the
number of recommendations constant but to add more diversification
and evaluate the fraction of diversity and navigability
that users are willing to trade in for accuracy.

7. CONCLUSIONS 

We have presented a general approach to evaluating the
reachability and navigability of arbitrary recommendation
networks. Our approach is based on an evaluation conducted
on two levels: First, we evaluate the topology of recommendation
networks by looking at components, eccentricity and
bow-tie structure. Second, we evaluate the dynamics of recommendation
networks by simulating three different navigation
models, namely Point-To-Point Search, Information Foraging
and Berrypicking. We applied this approach to two datasets
and found that reachability and navigability were not well
supported for standard recommendation algorithms. None
of the recommendation networks was navigable in any of the
scenarios if plausible and practical constraints are applied.
We investigated possible improvements for this and found
that the results could be improved with simple diversification
approaches.

We find that in our datasets, collaborative filtering performed
better than a content-based approach, suggesting that
exploiting the collective knowledge present in ratings leads 
to more easily navigable recommender systems. While the 
results of our experiments are limited to the datasets under 
investigation, our approach to evaluating the navigability 
of recommendation networks is general. It can be applied 
to arbitrary recommendation networks, thereby acting as 
a novel tool of measurement for an increasingly important 
dimension of recommendation systems. We hope that our 
work stimulates more research on evaluating and ultimately 
improving the navigability of recommendation systems and 
corresponding algorithms. 